13 research outputs found

The pig X and Y Chromosomes: structure, sequence, and evolution.

We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes--both single copy and amplified--on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution. This work was funded by BBSRC grant BB/F021372/1. The Flow Cytometry and Cytogenetics Core Facilities at the Wellcome Trust Sanger Institute and Sanger investigators are funded by the Wellcome Trust (grant number WT098051). K.B., D.C.-S., and J.H. acknowledge support from the Wellcome Trust (WT095908), the BBSRC (BB/I025506/1), and the European Molecular Biology Laboratory.
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007–2013) under grant agreement no. 222664 (“Quantomics”). This is the final version of the article. It first appeared from Cold Spring Harbor Laboratory Press via http://dx.doi.org/10.1101/gr.188839.11

University of Essex Research Repository

PubMed Central

UCL Discovery

Kent Academic Repository

Apollo (Cambridge)

RNAcentral: A vision for an international database of RNA sequences

Author: Agrawal Shipra
Bateman Alex
Birney Ewan
Bruford Elspeth A
Bujnicki Janusz M
Cochrane Guy
Cole James R
Dinger Marcel E
Enright Anton J
Gardner Paul P
Gautheret Daniel
Griffiths-Jones Sam
Harrow Jen
Herrero Javier
Holmes Ian H
Huang Hsien-Da
Kelly Krystyna A
Kersey Paul
Kozomara Ana
Lowe Todd M
Marz Manja
Moxon Simon
Pruitt Kim D
Samuelsson Tore
Stadler Peter F
Vilella Albert J
Vogel Jan-Hinnerk
Williams Kelly P
Wright Mathew W
Zwieb Christian
Publication venue: Cold Spring Harbor Laboratory
Publication date: 23/09/2011

During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There has also been a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor.

Crossref

UCL Discovery

PubMed Central

The University of Manchester - Institutional Repository

University of East Anglia digital repository

An intrinsically disordered proteins community for ELIXIR.

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-quality tools and resources to support the identification, analysis and functional characterisation of IDPs. The roadmap is the result of a workshop titled "An intrinsically disordered protein user community proposal for ELIXIR" held at the University of Padua. The workshop, and further consultation with the members of the wider IDP community, identified the key priority areas for the roadmap including the development of standards for data annotation, storage and dissemination; integration of IDP data into the ELIXIR Core Data Resources; and the creation of benchmarking criteria for IDP-related software. Here, we discuss these areas of priority, how they can be implemented in cooperation with the ELIXIR platforms, and their connections to existing ELIXIR Communities and international consortia. The article provides a preliminary blueprint for an IDP Community in ELIXIR and is an appeal to identify and involve new stakeholders.

Maastricht University Research Portal

Birkbeck Institutional Research Online

ZORA

Repository of the Academy's Library

UPF Digital Repository

Apollo (Cambridge)

Institute of Cancer Research Repository

Archivio istituzionale della ricerca - Università di Padova

Comparative analysis of the transcriptome across distant species

Author: Adam Frankish
Alex Dobin
Alexandre Reymond
Ali Mortazavi
Anastasia Samsonova
Andrea Tanzer
Ann Hammonds
Anurag Sethi
Arif O. Harmanci
AT Kalinka
Baikang Pei
Benjamin W. Booth
BR Graveley
Brent Ewing
Brenton R. Graveley
Brian Oliver
Burak H. Alver
Carrie A. Davis
Chao Cheng
Chao Di
Chau Huynh
Chenghai Xue
Chris Zaleski
Cristina Sisu
Cédric Howald
D Brawand
Daifeng Wang
David M. Miller
DF Simola
Dionna Kasper
Dmitri Pervouchine
Elise A. Feingold
Eric Lai
Erik Ladewig
Felix Schlesinger
Frank J. Slack
Gang Fang
Garrett Robinson
Gary I. Saunders
Gemma May
Gennifer Merrihew
Guanjun Gao
Guilin Wang
Haiyan Huang
Henry Zheng
Huaien Wang
J Merkin
J Reichardt
James B. Brown
Jen Harrow
Jiayu Wen
Jing Leng
Jingyi Jessica Li
JJ Li
JM Stuart
Joel Rozowsky
Jorg Drenkow
Julien Lagarde
Kathie L. Watkins
Kejia Wen
Kenneth H. Wan
Kevin Yip
Kimberly Bell
KK Yan
Koon-Kiu Yan
LaDeana Hillier
Li Yang
Long Hu
Lucy Cherbas
M Levin
M Talerico
Marcus H. Stoiber
Mark B. Gerstein
Masaomi Kato
Max E. Boeck
MB Gerstein
Megan Fastuca
Michael J. Pazin
Michael MacCoss
Michael O. Duff
modENCODE Consortium
Nathan P. Boley
NL Barbosa-Morais
Norbert Perrimon
Owen A. Thompson
Peter Cherbas
Peter J. Bickel
Peter J. Good
Peter J. Park
Pnina Strasbourger
R Karlić
Rabi Murad
Raymond Auerbach
Rebecca McWhirter
Robert R. Kitchen
Robert Waterston
Roderic Guigó
Roger A. Hoskins
Roger P. Alexander
S Djebali
S Kirkpatrick
Sara Olson
Sarah Djebali
Sonali Jha
Steven E. Brenner
Susan E. Celniker
T Domazet-Lošo
Thomas C. Kaufman
Thomas R. Gingeras
Tim J. P. Hubbard
Valerie Reinke
William C. Spencer
Yan Zhang
Zhi Lu
ZJ Lu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

University of Birmingham Research Portal

Harvard University - DASH

Serveur académique lausannois

PubMed Central

eScholarship - University of California

UPF Digital Repository

King's Research Portal

Brunel University Research Archive

ELIXIR Europe on the Road to Sustainable Research Software

Author: Harrow Jen
Kuzak Mateusz
Martinez Paula
Psomopoulos Fotis
Via Allegra
Publication venue: Pensoft Publishers
Publication date: 01/01/2019

ELIXIR (ELIXIR Europe 2019a) is an intergovernmental organization that brings together life science resources across Europe. These resources include databases, software tools, training materials, cloud storage, and supercomputers. One of the goals of ELIXIR is to coordinate these resources so that they form a single infrastructure. This infrastructure makes it easier for scientists to find and share data, exchange expertise, and agree on best practices. ELIXIR's activities are divided into the following five areas: Data, Tools, Interoperability, Compute and Training, each known as a “platform”. The ELIXIR Tools Platform works to improve the discovery, quality and sustainability of software resources. The Software Development Best Practices task of the Tools Platform aims to raise the quality and sustainability of research software by producing, adopting, and promoting information standards and best practices relevant to the software development life cycle. We have published four simple recommendations (4OSS) to encourage best practices in research software (Jiménez et al. 2017) and the Top 10 metrics for recommended life science software practices (Artaza et al. 2016). The 4OSS simple recommendations are as follows: Develop publicly accessible open source code from day one. Make software easy to discover by providing software metadata via a popular community registry. Adopt a license and comply with the licenses of third-party dependencies. Have clear and transparent contribution, governance and communication processes. In order to encourage researchers and developers to adopt the 4OSS recommendations and build FAIR (Findable, Accessible, Interoperable and Reusable) software, the best practices group, in partnership with the ELIXIR Training platform, The Carpentries (Carpentries 2019, ELIXIR Europe 2019b), and other communities, is creating a collection of training materials (Kuzak et al. 2019).
The next step is to adopt, promote, and recognise these information standards and best practices. The group will address this by (i) developing comprehensive guidelines for software curation, (ii) training researchers and developers towards the adoption of software best practices, and (iii) improving the usability of Tools Platform products. Additionally, a direct outcome of this task will be a software management plan template, connected to a concise description of the guidelines for open research software, and the production of a white paper on the software development management plan for ELIXIR, which can subsequently be used to produce training materials. We will work with the newly formed ReSA (Research Software Alliance) to facilitate the adoption of this plan by the broader community.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Lesson Development for Open Source Software Best Practices Adoption

Author: Harrow Jen
Jimenez Rafael C.
Kuzak Mateusz
Martinez Paula Andrea
Psomopoulos Fotis E.
Svobodova Varekova Radka
Via Allegra
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2018

Crossref

University of Queensland eSpace

The GENCODE exome: sequencing the complete human exome

Author: A Gnirke
Aarno Palotie
Alison J Coffey
Anna-Elina Lehesjoki
AR Quinlan
CA Hübner
Carol E Scott
Christopher J Joyce
Daniel J Turner
DT Okou
E Hodges
E Kalay
Eleanor Drury
Emily M LeProust
ER Mardis
Felix Kokocinski
H Li
J Harrow
J Shendure
Jen Harrow
KD Pruitt
LG Wilming
M Choi
Maria S Calafato
P Flicek
PA Futreal
Priit Palta
Sarah Hunt
SB Ng
ST Sherry
Tim J Hubbard
TJ Albert
Publication venue: Springer Science and Business Media LLC
Publication date: 02/03/2011

Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low-frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 more SNP variants (24%) in the GENCODE exome target than with CCDS-based exome sequencing.

Crossref

PubMed Central

King's Research Portal

University of Melbourne Institutional Repository

Genome-wide end-sequenced BAC resources for the NOD/MrkTac and NOD/ShiLtJ mouse genomes

Author: Abe
Adams
Anderson
Antoch
Atkinson
Bentley
Bob Plumb
Charles A. Steward
David J. Adams
Devendra
Dowell
Erlich
Frengen
Gregory
Hanna
Hubbard
Ivakine
James Bonfield
Jane Rogers
Jayne Danska
Jen Harrow
John Todd
Leiter
Linda Wicker
Makino
Matthew C. Jones
Michael A. Quail
Michael Nefedov
Nichols
Ning
Omid Gulban
Osoegawa
Paul Lyons
Pieter J. de Jong
Ridgway
Rob Davies
Sean Humphray
Stephen Rice
te Riele
Thomas M. Keane
Tim Hubbard
Todd
Tony Cox
Wendl
Wicker
Yoshihide Hayashizaki
Zhao
Publication venue: Academic Press
Publication date: 01/02/2010

Non-obese diabetic (NOD) mice spontaneously develop type 1 diabetes (T1D) due to the progressive loss of insulin-secreting β-cells by an autoimmune-driven process. NOD mice represent a valuable tool for studying the genetics of T1D and for evaluating therapeutic interventions. Here we describe the development and characterization by end-sequencing of bacterial artificial chromosome (BAC) libraries derived from NOD/MrkTac (DIL NOD) and NOD/ShiLtJ (CHORI-29), two commonly used NOD substrains. The DIL NOD library is composed of 196,032 BACs and the CHORI-29 library is composed of 110,976 BACs. The average depth of genome coverage of the DIL NOD library, estimated from mapping the BAC end-sequences to the reference mouse genome sequence, was 7.1-fold across the autosomes and 6.6-fold across the X chromosome. Clones from this library have an average insert size of 150 kb and map to over 95.6% of the reference mouse genome assembly (NCBIm37), covering 98.8% of Ensembl mouse genes. By the same metric, the CHORI-29 library has an average depth over the autosomes of 5.0-fold and 2.8-fold coverage of the X chromosome, the reduced X chromosome coverage being due to the use of a male donor for this library. Clones from this library have an average insert size of 205 kb and map to 93.9% of the reference mouse genome assembly, covering 95.7% of Ensembl genes. We have identified and validated 191,841 single nucleotide polymorphisms (SNPs) for DIL NOD and 114,380 SNPs for CHORI-29. In total we generated 229,736,133 bp of sequence for the DIL NOD and 121,963,211 bp for the CHORI-29. These BAC libraries represent a powerful resource for functional studies, such as gene targeting in NOD embryonic stem (ES) cell lines, and for sequencing and mapping experiments.

Elsevier - Publisher Connector

Crossref

PubMed Central

King's Research Portal